Cluster Computing Paradigms: A Comparative Study of Evolving Frameworks
Authors
Abstract
Cluster computing is an approach to storing and processing the huge amounts of data now being generated. Hadoop and Spark are the two most prominent cluster computing platforms today. Hadoop incorporates the MapReduce concept and is both scalable and fault-tolerant, but its limitations paved the way for another cluster computing framework, Spark. Spark is faster and, owing to its in-memory processing, can also manage multiple workloads. In this paper, we discuss the underlying concepts of Hadoop and the limitations that led to the development of Spark. We then give a detailed description of the Spark framework and its advantages. Finally, we demonstrate a word count problem in both Hadoop and Spark and present a comparative study.
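The word count problem mentioned in the abstract can be illustrated with a minimal single-machine sketch of the MapReduce idea. This is plain Python, not Hadoop or Spark code, and it is not the paper's implementation; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))
```

In a real cluster framework the map and reduce steps would run in parallel on different nodes and the shuffle would move data across the network; here everything runs in one process purely to show the data flow.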
Similar resources
Comparative survey between quantitative and qualitative paradigms (part II)
As stated in the first part of the article, there are four major philosophical paradigms, which make up the basis for knowledge (epistemology), the nature and reality (ontology) and the acquisition methods of knowledge (methodology). Thus, according to each paradigm, the approach to knowledge is determined. In a more general category, we have two quantitative and qu...
On the Viability of Component Frameworks for High Performance Distributed Computing: A Case Study
Software infrastructures that support metacomputing are evolving from traditional monolithic, platform-specific systems to component and service-based frameworks. In this paper we demonstrate that contrary to popular belief, such modular software systems are capable of delivering good to excellent performance, support legacy as well as new application programming paradigms, and deliver enhanced...
A Comparative Study of Health System Performance Assessment Frameworks Around the World
Background: To meet the need to assess health system performance, various models and frameworks have been developed by different groups and organizations. This study explores health system performance assessment frameworks using a comparative-analytical approach. Materials and Methods: This is a comparative-descriptive study conducted using descriptive-prescriptive method based on comprehensive com...
Survey and Performance Evaluation of DBSCAN Spatial Clustering Implementations for Big Data and High-Performance Computing Paradigms
Big data is often mined using clustering algorithms. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a popular spatial clustering algorithm. However, it is computationally expensive and thus for clustering big data, parallel processing is required. The two prevalent paradigms for parallel processing are High-Performance Computing (HPC) based on Message Passing Interface ...
A COMPARATIVE STUDY OF TRADITIONAL AND INTELLIGENCE SOFT COMPUTING METHODS FOR PREDICTING COMPRESSIVE STRENGTH OF SELF-COMPACTING CONCRETES
This study investigates the prediction model of compressive strength of self–compacting concrete (SCC) by utilizing soft computing techniques. The techniques consist of adaptive neuro–based fuzzy inference system (ANFIS), artificial neural network (ANN) and the hybrid of particle swarm optimization with passive congregation (PSOPC) and ANFIS called PSOPC–ANFIS. Their perf...